Quantum-Enhanced Reinforcement Learning for Finite-Episode Games with Discrete State Spaces
نویسندگان
چکیده
Quantum annealing algorithms belong to the class of metaheuristic tools, applicable for solving binary optimization problems. Hardware implementations of quantum annealing, such as the quantum annealing machines produced by D-Wave Systems [1], have been subject to multiple analyses in research, with the aim of characterizing the technology’s usefulness for optimization and sampling tasks [2–16]. Here, we present a way to partially embed both Monte Carlo policy iteration for finding an optimal policy on random observations, as well as how to embed n sub-optimal state-value functions for approximating an improved state-value function given a policy for finite horizon games with discrete state spaces on a D-Wave 2000Q quantum processing unit (QPU). We explain how both problems can be expressed as a quadratic unconstrained binary optimization (QUBO) problem, and show that quantum-enhanced Monte Carlo policy evaluation allows for finding equivalent or better state-value functions for a given policy with the same number episodes compared to a purely classical Monte Carlo algorithm. Additionally, we describe a quantum-classical policy learning algorithm. Our first and foremost aim is to explain how to represent and solve parts of these problems with the help of the QPU, and not to prove supremacy over every existing classical policy evaluation algorithm. ∗Corresponding author: [email protected] 1 ar X iv :1 70 8. 09 35 4v 1 [ qu an tph ] 3 0 A ug 2 01 7
منابع مشابه
Reinforcement Learning In Real-Time Strategy Games
We consider the problem of effective and automated decisionmaking in modern real-time strategy (RTS) games through the use of reinforcement learning techniques. RTS games constitute environments with large, high-dimensional and continuous state and action spaces with temporally-extended actions. To operate under such environments we propose Exlos, a stable, model-based MonteCarlo method. Contra...
متن کاملMulitagent Reinforcement Learning in Stochastic Games with Continuous Action Spaces
We investigate the learning problem in stochastic games with continuous action spaces. We focus on repeated normal form games, and discuss issues in modelling mixed strategies and adapting learning algorithms in finite-action games to the continuous-action domain. We applied variable resolution techniques to two simple multi-agent reinforcement learning algorithms PHC and MinimaxQ. Preliminary ...
متن کاملReinforcement Learning in Continuous State and Action Spaces
Many traditional reinforcement-learning algorithms have been designed for problems with small finite state and action spaces. Learning in such discrete problems can been difficult, due to noise and delayed reinforcements. However, many real-world problems have continuous state or action spaces, which can make learning a good decision policy even more involved. In this chapter we discuss how to ...
متن کاملReinforcement Learning Based Algorithms for Average Cost Markov Decision Processes
This article proposes several two-timescale simulation-based actor-critic algorithms for solution of infinite horizon Markov Decision Processes with finite state-space under the average cost criterion. Two of the algorithms are for the compact (non-discrete) action setting while the rest are for finite-action spaces. On the slower timescale, all the algorithms perform a gradient search over cor...
متن کاملA Convergent Reinforcement Learning Algorithm in the Continuous Case: The Finite-Element Reinforcement Learning
This paper presents a direct reinforcement learning algorithm, called Finite-Element Reinforcement Learning, in the continuous case, i.e. continuous state-space and time. The evaluation of the value function enables the generation of an optimal policy for reinforcement control problems, such as target or obstacle problems, viability problems or optimization problems. We propose a continuous for...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- CoRR
دوره abs/1708.09354 شماره
صفحات -
تاریخ انتشار 2018